A novel approach to the Orienteering Problem based on the Harmony Search algorithm

This article presents a new approach to designing a Harmony Search (HS) algorithm, adapted to solve Orienteering Problem (OP) instances. OP is a significant NP-hard problem that has considerable practical application, requiring the development of an effective method for determining its solutions. The proposed HS has demonstrated its effectiveness through determined optimum results for each task from the six most popular benchmarks; a marginal number approximated the best results, with the average error below 0.01%. The article details the application of this described algorithm, comparing its results with those of state-of-the-art methods, indicating the significant efficiency of the proposed approach.


Introduction
The Orienteering Problem (OP) is an N P-hard [1] optimization problem that assumes the need to maximising the number of points received due to visiting particular nodes within a pre-defined time limit. According to [2], this can be considered a combination of two issues that are significant from a utilitarian perspective: the Knapsack Problem and the Traveling Salesman Problem. Connecting features of these problems enables application of the OP in economic practice and optimizes many processes occurring in the logistics area of interest, such as planning a traveling salesman route, where time limits prevent the salesman from visiting all of the towns [3], solving tourist-route design problems [4], and analyzing other inventory or routing problems [1]. It should also be emphasized that one OP variant is distinct from the utilitarian persperctive beacuse it includes M of team members (i.e., the Team Orienteering Problem (TOP)). This is among the Vehicle Routing Problems with Profits (VRPP) variants [5] that are characterized by the need to select a subset of customers, describe them in terms of profit, and design routes according to which the sum of the profits gained is maximized under time (distance) constraints. Among these variants, we can also distinguish the Capacitated Profitable Tour Problem (assuming maximization of profits minus the travel costs under capacity restrictions) and a variant featuring a private fleet and common carrier, in which some customers may be delegated to another carrier subject to a cost [5].
Because this issue pertains to the class of N P-hard problems, approximate approaches are often applied to solve OP instances, precluding the design of optimal travel routes within an acceptable timeframe (according to [2], using exact algorithms is too time-consuming for the a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 context of solving similar optimization problems, can enable the development of a method that can construct more favorable solutions than other metaheuristics commonly used to solve OP instances. In this way, we want to contribute by proposing a method that can be successfully used in business practice to solve large OP instances, the complexity of which makes it impossible to apply exact algorithms. Among the research adapting HS to solve various OP variants, we can distinguish the paper [32] and [33], which propose variants of the algorithm designed to approximate the Generalized Orienteering Problem. Elsewhere, Tsakirakis et al. has described an interesting Similarity Hybrid HS algorithm for the TOP [34]. However, this paper's approach is based on the idea proposed in [30], which achieved satisfactory results for a similar optimization problem (i.e., the ATSP).
This article comprises seven sections. After the introduction to the subject matter, the HS's characteristic features are presented. Next, we describe the OP and present the HS structure as adjusted to solve the OP. After explaining the research methodology in more detail, we discuss the results obtained. The paper the details the study's conclusions and plans for further research.
The results presented in the article have been published in the doctoral dissertation [35].

Description of the classical Harmony Search
The HS is a modern metaheuristic proposed in [20] and based on the similarity of musical improvisation to the search for the optimum in optimization processes. The algorithm's author transcribed real-world observations onto the HS diagram by implementing the following parameters and structures: • Pitch: an element that corresponds to the decision variable of the given optimization problem.
• Harmony: a structure comprising n pitches (where n represents the number of decision variables describing the given problem) and which is a complete solution to the problem.
• HM (Harmony Memory): a structure storing harmonies sorted according to their objective function value (the worst element is positioned last).
• HMS (Harmony Memory Size): the size of the HM (i.e., the number of harmonies stored in the HM).
• HMCR (Harmony Memory Consideration Rate): the probability of playing the pitch by heart. With a probability equal to the HMCR, the choice of pitch value takes place in position i in the created solution, based on the value of pitches in position i, describing harmonies stored in HM. With a probability of 1 − HMCR the random value of pitch i is selected from the acceptable range.
• PAR (Pitch Adjustment Rate): the probability of modification of the pitch value in the i position on the basis of the value of pitches in position i in HM.
• bw (Bandwidth): the range of pitch value changes relative to the parameter PAR, with a value corresponding to the maximum change value of the chosen pitch in position i.
• IT: it defines the condition in which the algorithm stops (interpreted as the number of harmonies created in the main method loop).
At the beginning of an HS operation, HMS harmonies are created and, included in the HM. They are subsequently sorted based on objective function value, such that the worst harmony is found in the last position.
According to [36], improvisation in music assumes the aim of better harmonization involves trying various pitch combinations according to the following possibilities: 1. Playing the pitch by heart.
2. Modifying the pitch by heart.
3. Playing a random pitch from within the acceptable range.
Based on music's improvisation process, Geem assumed that the algorithm must iteratively create subsequent harmonies, choosing the value of the next pitch i according to the following rules: 1. The choice of pitch i, based on the value in position i in HM (probability equals HMCR � (1 − PAR)).
2. The choice of modified pitch i, based on the value in position i in HM (probability equals HMCR � PAR).
3. The choice of pseudorandom value from the available range (probability equals 1 − HMCR).
A new harmony's objective function value is compared with the applicable value describing the worst element located in the HM. If the new solution is more beneficial, it replaces the worst harmony stored in the HM, after which the elements of the HM are sorted again. The procedure is repeated for IT iterations. However, the HS has caused considerable controversy within the research community. For example, according to Weyland, the algorithm is simply represents a special case of the Evolution Strategies (ES) and does not offer any novelty [37]. The author conducted a detailed analysis to demonstrate the imperfections in the results presented by Geem in [22]. However, in [38] Kim responded Weyland's allegations by demonstrating the differences between the HS and ES.

Formulation of the Orienteering Problem
This article is based on the formulation of the OP presented by Vansteenwegen et al. [2] and assumes the occurrence of a set of N nodes and assigns a number of points S i to each i node. After stablishing the beginning node (marked 1), ending node (marked N), travel time between nodes i and j (for all nodes), t ij , and the time limit T max are set, the decision variables are introduced: x ij (which assumes a value of 1 if node j is visited after node i and 0 otherwise) and u i (which indicates the position of node i in the path).
The objective function of the problem, assuming the maximization of collected points, is determined according to the following: The limiting condition, ensuring that the path starts with Node 1 and ends at node N, is as follows: The condition enforcing the creation of connections in the path and guaranteeing that each node is not visited more than once is determined according to the following: The need to comply with the time limit T max is determined as follows: To eliminate subtours, the following is introduced: Additionally, the following is assumed:

Harmony Search adjusted to solve the Orienteering Problem
The OP solution comprises a sequence of points that must be visited in a given order. Given the similarity between the OP and ATSP solutions, we decided that the natural representation of the HM would be the path to be visited, in the order in which individual pitches appear. The process of choosing the value of the next pitch assumes the following: 1. Choosing pitch values is based on the elements of the HM (performed with the probability HMCR � (1 − PAR)), indicating that preparing the list of nodes, which occur in harmonies located in the HM directly after the last node added to a new harmony H (for the OP, as in the case of the ATSP, we assumed that it was more important to analyze the successors and predecessors of the node in the path than the absolute order in which the node was visited).
In the example presented in Fig 4, the nodes occurring directly after Node 4 are analyzed. Although Node 6 and 3 appear in HM[0] and HM [2], they cannot be added to the new solution due to occuring in H. Only Node 2, which exists in HM [1], can potentially be added to the list created. However, before adding it to the list, it is necessary to check the possibility of managing it and, subsequently, point N (in example Node 5) without infringing on the limitation T max .

PLOS ONE
Next, following the approach described by Komaki et al. [39], a node belonging to the list is chosen based on the objective function value of the harmony it derives from, using the roulette wheel selection method (thus increasing the method's exploitation ability). In cases in which the list of nodes is empty, it is completed with the available nodes-i.e., the nodes not yet visited; after visiting, it will be possible to collect points in node N, considering the time limit T max -and by determining the quotient g for the number of points achieved (for each) by visiting the node compared to the distance (according to travel time) between the last visited node and the current one. The HMS nodes from the list, with the greatest g values subsequently, participate in the roulette wheel method selection (if the list contains fewer elements than HMS, all nodes are considered).
2. Modifying pitch value (executed with a probability equal to HCMR � PAR) is based on the modified heuristics proposed by Golden et al. [40] in an article including an analysis of the effect of individual component weights on heuristics efficiency that demonstrates how knowledge about the problem is used to increase the entire technique's efficiency. In the HS discussed the designation for each available node i assumes the value: where: • Rs i ranks the number of points obtained at node i among the points describing available nodes.
• Rc i ranks the distance of node i (with coordinates (x i ,y i )) from the center of gravity of available nodes (creating the set A), the coordinates of which derive from the formula: • Rp i ranks the distance of node i from the last visited node. Ranks are employed to limit the impact of various ranges of values of particular features describing nodes upon the value W i . It was assumed that ranks are given based on sorted

PLOS ONE
values in the ascending order for Rc i as well as Rp i -meaning the nearest nodes receive the rank 1, the next nearest receive the rank 2, and so on, and if two nodes are described by the same value, they receive the same rank, and the next node in line receives a rank that is increased by one-and in the descending order for Rs i , meaning rank 1 is assigned to the node(s) with the greatest number of points. Later ranks are normalized to one range for Rc i , Rp i , and Rs i . After receiving the values of W i for each available node, the required pitch value is chosen by employing the roulette wheel method (promoting lower W i values) for the maximum number of HMS nodes with the lowest value of W i .
3. Choosing (with a probability equal to 1-HMCR) an available pseudorandom node from the set of such nodes infers that it remains possible to visit the node N after visiting one of them without infringing on the time limit T max . After creating a new solution H, it is necessary to determine whether it is characterized by a more beneficial objective function value than the worst harmony stored in HM. In such cases, the procedure (based on the heuristics proposed by Golden et al. [1] and Chao et al. [6]) for improving it begins-building on the previous demonstration of the significant efficiency of this process at this location within the HS structure [41]-following four steps: 1. Optimizing a new route by means of the method 2-opt and using it to replace the worst solution located in HM.
2. Removing from the route node i (which cannot be the starting and ending node), which features the lowest value of the quotient m: where: • matrix[p][i] represents the distance (travel time) between node p and i, • p is a node preceding node i in the analyzed route plan, and • f is a node that follows the node i in the analyzed route plan.
3. It is necessary to add as many available points as possible (for the best possible positions) to the route created in Step 2, beginning with those, whose appearance in the route enables the achievement of the largest possible value of quotient m (it is necessary to check value m for each point in each possible route position).

If the route was improved in
Step 3 relative to the base solution, H, the plan needs to be optimized again using the 2-opt method, with this optimization replacing the solution added to HM in Step 1.
After executing R iterations from the last replacement of the worst solution, there is another draw of HM content (excluding the best solution) to avoid premature convergence. The efficiency of the mechanism for ATSP has previously been demonstrated in [42], indicating that by choosing the appropriate R value, we can preserve the diversity of HM content, maintaining an appropriate level of solution space exploartion. Fig 6 presents the HS pseudocode adjusted to solve OP instances.

Research methodology
The decision was made to use a 'test bed' in the form of four generally accessible sets of Tsiligirides tasks (where Set 4 is Set 1 with a correction based on [6]; the coordinates of one point (4.90, 18.90) were replaced with (4.90, 14.90)), Chao sets 64 and 66, and selected 45 tasks from

PLOS ONE
Generation 1 of OPLib (analysed by [11]). Each task was solved 30 times using a different seed.

PLOS ONE
The quality of solutions was assessed by measuring the number of iterations required to achieve convergence it (the average value was designated it, and sample standard deviation as σ it ), the time required to achieve convergence t [s] (the average value was designated as t, the sample standard deviation as σ t ), and the average error value as e (formula (11)), the best result's error value r (formula (12)), and the worst (w) and the best (b) determined objective function values on the basis of 30 repetitions of the algorithm. e ¼ ðoptimum À averageÞ optimum � 100%: ð11Þ The algorithm was implemented in C#, and the tests were performed using the laptop Lenovo Legion Y520, with the following components: Windows 10 Home 64-bit, 32 GB RAM (SO-DIMM DDR4, 2400MHz) and Intel Core i7-7700HQ (4 cores, 2.8 GHz to 3.8 GHz, 6 MB cache).
In order to verify the effectiveness of the hybridization of the HS algorithm with 2-opt, the tasks from the Tsiligirides and Chao sets were solved by means of a variant without the solution improvement step using 2-opt. The obtained mean value of the objective function for the variant without 2-opt was denoted as e l . Additionally, the obtained objective function values were subjected to the Wilcoxon Signed-Rank Test, using R. Both variants (noted as V1 and V2) were subjected to the process, whereas the value of 0.05 was assumed as the level of test significance (the obtained p-values described with the lower result indicate to accept the alternative hypothesis according to which V1 obtained greater results than V2).
The objective function values achieved by the HS were compared with the results obtained by the following algorithms according to the respective bibliographical sources:

Results
The results achieved by the HS are presented in Tables 1-7 for Tsiligirides sets 1, 2, 3 and 4 and Chao sets 64 and 66, and tasks from OPLib respectively. The attribute Best represents the best objective function value obtained in the papers reviewed (see, in particular [7,8,10,11,13]) and was considered to calculate the values r and e. These results support the significant efficiency of the proposed HS showing that it was able to determine the best routes for all 107 instances of the problem from Tsiligirides and Chao sets (r is equal to 0%).
For tasks from the Tsiligirides sets, the HS achieved the best solutions in all cases (both r and e amount to 0%), with the time to achieve convergence characterized by small values (for

PLOS ONE
the vast majority of analysed values T max , t amounted to less than one second), indicating the problem-free possibility of applying the method to solve practical problems with characteristics that align with the instances analyzed. Meanwhile, analyzing it and σ it confirm the possibility of limiting the value of IT while maintaining the quality of the solutions constructed. For bigger instances of the OP problem, from the sets of Chao 64 and 66, the HS usually obtained the best solutions. However, for some values T max (55 for Chao 64 and 100, 110 and 120 for Chao 66), suboptimal route plans were also constructed, characterized by slightly worse objective function value (acceptable for many practical applications). The time required to achieve convergence t increased to more than 40 seconds on average. Additionally, the number of iterations required to achieve convergence by the technique increased. These observations may indicate the need to improve the method, through a better choice of value of the parameters or replacement of the local search technique (2-opt) with the algorithm characterized by better efficiency.
Based on the comparison of the average error for the tasks from the Tsiligirides and Chao sets achieved by the HS and HS without 2-opt (presented in Fig 7) and Wilcoxon test results (shown in the Table 8), significant effectiveness of the hybrid approach was found and its use is recommended.
Tasks derived from OPLib are characterized by different sizes, enabling the determination of the behavior of the proposed algorithm for different OP instances. For tasks consisting of

PLOS ONE
fewer nodes than 150, very high HS efficiency can be observed (only for two tasks out of 26 no optimal solution was designated). However, as the number of nodes increased, the algorithm began to obtain more and more near-optimal results, which indicates the need to adjust the HS parameter values for larger instances. Both the average error obtained and the time to achieve convergence testify to the possibility of using HS to solve many practical problems. Table 9 compares the HS results with the results presented in the literature on the subject, revealing that the proposed algorithm is much more efficient and can compete with the best state-of-the-art methods. The HS achieved the most advantageous average value w for Chao 64, with less beneficial results only observed for Chao 66 (the average value w was worse than the average values achieved by both for the ACO algorithm and the GRASP with PR approach).
Comparison of results obtained by HS and state-of-the-art algorithms for tasks from OPLib is presented in Table 10 (optimal results are in bold). On their basis, we found significant efficiency of our approach, which gained an advantage over 2-Parameter IA and EA4OP for tasks, characterized by the occurrence of up to 144 nodes. For larger tasks, the method obtains slightly worse results than competing approaches, but this difference could be eliminated by further improving the technique (in this paper we focused on choosing parameter values so that the algorithm copes well with tasks of moderate size). It is worth noting that the average objective function value for HS was higher than for the 2-Parameter IA.

Conclusions and further research plans
The proposed modifications enabled the design of an efficient HS variant that achieved solutions characterized by an average error below the 0.01% level for the classic benchmark. For all 107 OP instances-derived from the six most-popular sets of tasks-the proposed approach was able to construct optimum route plans that, along with the short time of reaching convergence and favorable results when compared with the state-of-the-art methods, confirm the possibility of using the prepared algorithm in economic practice.
For the tasks from OPLib, our algorithm has proved to be highly effective, in particular for tasks with fewer nodes than 150 (it obtained better results than 2-Parameter IA and EA4OP).  For larger tasks, the method obtains slightly worse results than competing approaches, but this difference could be eliminated by further improving the technique. Future work on this subject should check the efficiency of the HS in the context of using other local search techniques (other than 2-opt; e.g. 3-opt, Lin-Kernighan heuristic [46] or Lin-Kernighan-Helsgaun algorithm [47]) and places and criteria (e.g., allowing improving of slightly worse solutions from the worst HM harmony) of its occurrence is recommended (studies on the influence of the localization of the local search algorithm in HS are presented in the paper [41]). Furthemore, additional tests should be performed, including the assignment of optimal values of algorithm parameters and checking the effectiveness of the method on other tasks from OPLib. Finally, it would be worthwile adjusting the method to address extended OP variants (e.g., OP with time windows [48] and TOP [49]).